Resource allocation and utilization in the Blue Gene/L supercomputer
Authors
Abstract
This paper describes partition allocation for parallel jobs in the Blue Gene/L supercomputer. It describes the novel network architecture of the Blue Gene/L (BG/L) three-dimensional (3D) computational core and presents a preliminary analysis of its properties and advantages compared with those of more traditional systems. The scalability challenge is solved in BG/L by sacrificing granularity of system management. The system is treated as a collection of composite allocation units that contain both processing and communication resources. We discuss the ensuing algorithmic framework for computational and communication resource allocation and present results of simulations that explore resource utilization of BG/L for different workloads. We find that utilization depends strongly on both the predominant partition topology (mesh or torus) and the 3D shapes requested by the running jobs. When communication links are treated as dedicated resources, it is much more difficult to allocate toroidal partitions than mesh ones, especially for jobs of more than one allocation unit in each dimension. We show that in these difficult cases, the advantage of BG/L compared with a 3D toroidal machine of the same size is very significant, with resource utilization better by a factor of 2. In the easier cases (e.g., predominantly mesh partitions), there are no disadvantages. The advantage is primarily due to the novel multi-toroidal topology of BG/L, which permits coallocation of multiple toroidal partitions at negligible additional cost.
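To make the idea of allocating 3D partitions from a grid of allocation units concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it is a plain first-fit search over a hypothetical machine modeled as a set of occupied unit coordinates, and it tracks only the processing units. It does not model the dedicated communication links that the abstract identifies as the main difficulty for toroidal partitions. The names `find_free_partition` and `utilization` are assumptions made for illustration.

```python
from itertools import product


def find_free_partition(occupied, machine, shape):
    """First-fit search for a free rectangular block of allocation units.

    occupied: set of (x, y, z) allocation units already in use
    machine:  (X, Y, Z) machine size, in allocation units
    shape:    (a, b, c) requested partition size, in allocation units

    Placements may wrap around the machine edges, as on a 3D torus.
    Returns (base_corner, list_of_units) for the first free placement,
    or None if no placement exists. Link allocation is NOT modeled.
    """
    if any(s > m for s, m in zip(shape, machine)):
        return None
    for base in product(*(range(m) for m in machine)):
        units = [
            tuple((b + o) % m for b, o, m in zip(base, offset, machine))
            for offset in product(*(range(s) for s in shape))
        ]
        if not any(u in occupied for u in units):
            return base, units
    return None


def utilization(occupied, machine):
    """Fraction of allocation units currently in use."""
    total = machine[0] * machine[1] * machine[2]
    return len(occupied) / total


if __name__ == "__main__":
    machine = (2, 2, 2)        # hypothetical 2 x 2 x 2 machine
    occupied = {(0, 0, 0)}     # one unit already allocated
    placement = find_free_partition(occupied, machine, (2, 1, 1))
    if placement is not None:
        base, units = placement
        occupied.update(units)
    print(base, utilization(occupied, machine))
```

Even in this simplified model, a single busy unit can block every placement of a large request, which illustrates the kind of fragmentation that lowers utilization.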
Similar resources
FRA-PSO: A two-stage Resource Allocation Algorithm in Cloud Computing
Cloud computing gives a large quantity of processing possibilities and heterogeneous resources, meeting the prerequisites of numerous applications at diverse levels. Therefore, resource allocation is vital in cloud computing. Resource allocation is a technique by which resources such as CPU, RAM, and disk in cloud data centers are divided among cloud users. The resource utilization, cloud service p...
An Intelligent Algorithm for Optimization of Resource Allocation Problem by Considering Human Error in an Emergency Department
Human error is a significant and ever-growing problem in the healthcare sector. In this study, resource allocation problem is considered along with human errors to optimize utilization of resources in an emergency department. The algorithm is composed of simulation, artificial neural network (ANN), design of experiment (DOE) and fuzzy data envelopment analysis (FDEA). It is a multi-response opt...
Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers
Three-dimensional torus is a common topology of network interconnects of multicomputers due to its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces resource utilization of the machines. In par...
A Multi Objective Fibonacci Search Based Algorithm for Resource Allocation in PERT Networks
The problem we investigate deals with the optimal assignment of resources to the activities of a stochastic project network. We seek to minimize the expected cost of the project, which includes the sum of resource utilization costs and lateness costs. We assume that the work content required by the activities follows an exponential distribution. The decision variables of the model are the allocated resourc...
Improve Resource Utilization by Task Scheduling in Cluster Computing
Parallel and distributed computing is the solution to many high-performance computing requirements in business and research operations. Since parallel computing became usable, a number of scheduling algorithms have been proposed for performance improvement and resource utilization. In this paper we consider the scheduling problem for resource allocation to tasks in cluster computing. In addition ...
Journal:
- IBM Journal of Research and Development
Volume 49
Publication date: 2005